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Abstract 



Current dynamical overlap fermion hybrid Monte Carlo simulations encounter large 

fermionic forces when there is mixing between eigenvectors of the kernel operator 

with near zero-eigenvalues. This leads to low acceptance rates when there is a large 

density of near zero eigenvalues. I present a method where these large forces are 

. eliminated and the large action jumps seen when two eigenvalues approach zero are 

significantly reduced. This significantly increases the stability of the algorithm, and 

CO ■ allows the use of larger integration time steps. 
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1 Introduction 



The overlap Dirac operator [1,2], which unlike other formulations of lattice 
QCD has an exact lattice chiral symmetry [3] and a corresponding index the- 
orem, offers numerous exciting possibilities for research in dynamical lattice 
QCD [4,5,6,7]; but presents a number of distinct challenges. The first challenge 
is the numerical cost, but this is not insurmountable on modern computers. 
Today simulations on 16 3 32 lattices are feasible [8], and it will not be long un- 
til large scale simulations will not only be possible but entirely practical and 
commonplace. The other difficulties involve the technical details of the algo- 
rithm, and in this paper I will focus on one of these issues, so far unexplored 
in the literature. 



The overlap operator is defined as 

D = (l + fi) + (l-fi) l5 e(Q), (1) 

where fx is a mass parameter proportional to the bare fermion mass and Q 
is the Hermitian form of a suitable lattice Dirac operator (the kernel) with 
no fermion doublers and negative mass p. In this work, I will always use the 
Wilson operator with p = 1.5, or, alternatively, k — 1/(8 — 2p) = 0.2: 



Qxy 75 
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(2) 



The matrix sign function is defined as 

e(Q)=X)l^)<^|sign(A i ), (3) 

i 

where \ipj) and \ are the eigenvectors and eigenvalues of Q respectively, and 
the sum is over the complete set of eigenvectors. In practice, given that the cal- 
culation of the entire eigenvalue spectrum is impractical, it is usual to use an 
approximation to the sign function, such as the Zolotarev Rational approxima- 
tion [9], for the bulk of the eigenvalue spectrum. The spectral decomposition 
is only used for for the eigenvalues closest to zero, where no approximation 
can (realistically) be accurate enough without a large computational cost. 

In terms of the Hermitian overlap operator H = 75-D, and the gauge action 
S g [U] for a gauge field U, the lattice QCD partition function for two degenerate 
flavours of fermion is 

Z = f dUdet(H 2 [U,p})e- s ^ = J dUd^e- 8 *^ 11 ^, (4) 

where I have used pseudo-fermion fields <fi to approximate the fermion deter- 
minant. The standard Hybrid Monte Carlo (HMC) algorithm [10] generates 
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a new gauge field by introducing a momentum II, updating the momentum 
and gauge field along the classical trajectory using a numerical integration 
algorithm (the molecular dynamics), and finishing with a metropolis step to 
ensure that the update of the gauge field satisfies detailed balance. The nu- 
merical integration must be reversible and ergodic. It does not have to be 
area conserving, but in a non-area conserving molecular dynamics the Jaco- 
bian must be calculated and included in the metropolis accept /reject step, as 
discussed in section 2. 

The numerical integration requires the calculation of a fermionic force, ob- 
tained by differentiating the action with respect to the gauge field. For the 
overlap operator, the action is discontinuous, leading to two problems: firstly 
there is a delta function in the force whenever an eigenvalue of the kernel 
operator, Q, changes sign; and secondly a large peak in the force when two 
eigenvectors, whose eigenvalues have different signs, mix. The first problem 
can be compensated for using the "transmission/reflection" algorithm, first 
published by Zoltan Fodor and collaborators [4], and subsequently improved 
by my own work [5,11]. There are still additional difficulties, particularly the 
rate of topological charge changes at small mass [12] and the volume depen- 
dence of the algorithm [13], but these can be resolved [14,15]. 

The second problem is a little more technical. Until this study, the eigenvalues 
and eigenvectors of a sparse matrix have been differentiated using a procedure 
analogous to first order perturbation theory. This method is outlined in refer- 
ence [11], although the idea is not original to the cited paper. The differential 
of the matrix sign function (neglecting the delta function) with respect to 
the molecular dynamics time r obtained from this method can be expressed 
in terms of the complete basis of eigenvalues and eigenvectors of the kernel 
operator 

j- (i^Dsign^) = E l^)(^-i^Qi^)(^l Sign( ^ I 8 ^ {Xj) . (5) 

It is clear that there is a large differential, and thus large fermionic force, 
when there is a pair of eigenvalues close to zero, but with different signs (see 
figure 1). I refer to this as the "eigenvalue mixing problem" for reasons that 
shall become obvious later. So far dynamical overlap simulations have tried 
to avoid this problem by suppressing the number of small eigenvalues of the 
kernel Dirac operator, either by smearing [6,16], or by adding an additional 
term to the action [17] Neither of these methods are satisfactory: too much 
smearing will distort the physics, and will not remove the problem on suf- 
ficiently large volumes. The algorithm with the additional term may not be 
ergodic if topological sectors are either not internally connected, or (if they 
are connected) the computer time required to evolve to a different sub-sector 

2 These two approaches also have the advantage of accelerating the computation. 
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Fig. 1. The trace of the square of the fermionic force (top), molecular dynamics 
energy (middle), and the Wilson operator eigenvalues (bottom) across one trajectory 
on an 8 3 16 ensemble with mass fi = 0.03, time-step r = 0.01, two pseudo-fermion 
fields, and two steps of stout smearing with parameter 0.1. By explicit calculation 
during the molecular dynamics, I observed that there was no exactly zero Wilson 
eigenvalue between between the 7th and 35th micro-canonical steps. The two low 
lying eigenvectors mixed at the 19th micro-canonical step, but the eigenvalues did 
not cross. 



4 



is unreasonably large. My own approach so far has been to use a moderate 
amount of smearing, regulate the force to prevent it from becoming too large 
(leading to instabilities and a breakdown of reversibility), and to run short 
trajectories (so that if I do encounter a problem I have not lost too much 
computer time) with a small time-step (which, as shall be made clear later, 
reduces the number of occurrences of the large forces). This allowed me to run 
on small lattices (up to 12 3 24), with the large forces sufficiently infrequent that 
they did not significantly reduce the metropolis acceptance rate. However, as 
the lattice volume is increased, the density of small eigenvalues also increases 
and the time-step would have to be reduced to unmanageable proportions to 
allow acceptance. Also, methods which use multiple times scales [18] com- 
bined methods such as using additional pseudo-fermions to precondition the 
force [19] and RHMC [20] are not as efficient as one might hope for. This is 
because the time-step needed for the integration is determined by the differ- 
ential of the sign function, common to all the terms in the terms in the forces 
constructed in these methods, rather than the condition number of the overlap 
operator. Clearly reducing the time-step as the density of small eigenvalues 
increases is not an optimal solution. 

The reason for these large forces becomes evident once it is realised that equa- 
tion (5) is just the first term in a Taylor expansion in r/(A« — Xj) of the 
mixing angle between the two eigenvectors, which is a function of the gauge 
field, time-step and momenta. Including higher order terms would lead to a 
force that does not conserve area or is not reversible. When r/ (Aj— Xj) is small, 
the expansion is valid, and everything works well. When it is not so small, the 
higher order terms start contributing, leading to an uncalculated and perhaps 
substantial correction to the energy conservation. When it is larger still, the 
series expansion may not converge at all. However, using the exact mixing an- 
gles rather than the expansion would eliminate the large forces. In this paper, 
I describe how this can be done. This approach is not area conserving; but the 
Jacobian can be calculated, and corrected for in the metropolis accept/reject 
step. No account of the Jacobian is made when trying to conserve energy, 
but the size of the Jacobian contribution to the action is 0(r 3 ), the same as 
the normal molecular dynamics energy violations. When the mixing becomes 
large, there will be a large Jacobian, but this is still considerably smaller than 
the action jump caused by the large forces using the old method. This new 
method is not manifestly reversible, but it is possible to construct a reversible 
algorithm by combining forward and backward updates. Stout smearing is 
technically more challenging to apply efficiently with this new method; but it 
is possible. 

Section 2 outlines how a non-area conserving (NAC) HMC can be constructed, 
and describes the calculation of the new fermionic force and the Jacobian. 
Section 3 outlines numerical results comparing this algorithm with the old 
method. Section 4 is a conclusion, and there are two appendices describing 
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some of the more technical details of the proposed algorithm. 



2 Non-area conserving HMC for overlap fermions 

2. 1 Hybrid Monte Carlo 

To fix the notation I start by reviewing the hybrid Monte Carlo algorithm 
for two flavours of fermion [10]. A Monte Carlo method satisfies the detailed 
balance condition 

P[U' <- U\W C [U\ = P[U <- U']W c [Ul (6) 

where VFcft/] is the canonical ensemble and P[U' <— U\ is the probability 
of updating from gauge field U to gauge field U' . In a Hybrid Monte Carlo 
method, we introduce a momentum field II, which contains a Hermitian trace- 
less matrix on every link of the lattice, and which is generated according to a 
Gaussian distribution. We evolve the gauge field and the momentum according 
to a reversible and ergodic trajectory T[U, II]. Finally, we include a metropolis 
step to correct for small changes in the energy E = IT 2 /2 + S g [U] + <p^H~ 2 <f). 
Thus the probability of generating a field U' from a field U, for a canonical 
ensemble, 

Wc [U] = J #t#e- 5 ^-^- 2 ^, (7) 

is 

p[U' ^U]=J dUdU'e-^ n2 5([U',U'}-T[U,U}) 

min (l, e s 9 [u'}-^H- 2 iu']4>-^+s 9 iu]+^H- 2 iu]<P+^-iogj^ ^ ^ 

where the fermion determinant is approximated using a pseudo-fermion field 
0, in the standard way, and J is the Jacobian 

au &n_ 

j _ dU' dU' 

du an 

dW dW 

It is easy to show that this update satisfies the detailed balance condition (6). 
The only non-standard part of equation (8) is the inclusion of the Jacobian [21]. 
Most applications use an area conserving molecular dynamics update, so that 
the logarithm of the Jacobian is zero. However, if it is possible to calculate 
the Jacobian, there is no restriction forcing the use of an area conserving 
algorithm, should an alternative method prove to be advantageous. 



(9) 
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2.2 The new algorithm 



For simplicity, I start by considering a system with two eigenvectors of Q, 
and |^ 2 ), with eigenvalues Ai and A 2 . I intend to differentiate the eigen- 
vector with respect to the gauge field, which requires finding the change in 
the eigenvectors caused by a small change in the gauge field. I write the new 
eigenvectors as 

K> =|^i)cosfl+ |^ 2 >e l5 sin#, 

\tf 2 ) =|-0 2 ) cos 9- \ipi)e- iS sm6. (fO) 

If SQ is the change in the kernel operator Q, and S A the change in the eigen- 
value, then by considering the eigenvalue equations, 

Q =\ , 

(q + sq) m =(a* + *ao m , ai) 

it is easy to show that 



A 1 -A 2 + (Vi|Wi>-(V'2|W2> 1 J 



and 



\l(Vi|<W2>' 



(13) 



Using the usual equation of motion (d/drU = irUU), it is possible to expand 
5Q in r, which gives 

SQxy = ~ iTK"/ 5 J2 [(! ~~ l^^ X )U^x)5 y ^ + ^- 

(1 + ^)Ul(x - fiTl^x - Ai)(J V) x-J • (14) 

The molecular dynamics momentum, II, can be written as 

U ll (x)=^T;(x), (15) 

where T^(x) is a generator of SU(3) (normalised so that Tr T*T J = 5ij) 
on a link proceeding from lattice site x in direction /x, and 7r l:CM is a vector 
representation of the momentum field. Now it is straightforward to express 
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the change in the sign function in terms of the mixing angles 9 and 5: 



F^ AC ' e aij(T, n)vr 

=Wi)Wi\<K) + Mi^HK) - (|^i)(^i|e(Ai) + |^>^ 2 |e(A 2 )) 

=|*><*| sm ( e(A 2 ) - e (A l} )- (t^q^) + J + 

M(H sm *( e ( Al ) - e(A 2 ))- {^^M + ^sQm) + 
M(H cos^ sin 9e-*(e{\ x ) - e(A 2 )) ^|^|^| + 
M(^\ cosgsin^^AO - e(A 2 )) |^]^ ■ (16) 

F^ AC ' e , defined by this equation, shall be used to construct the NAC (non- 
area conserving) fermionic force, are defined below. The terms such as 
(V^I^QlV'i)/ (V^I^QI^i) = 1 have been added for reasons that will be outlined 
in the discussion following equation (20). Note that by expanding equation (16) 
around r = 0, and neglecting terms of order r 2 and higher, one recovers the 
original expression for the derivative of the sign function (equation (5)). But 
when 9 becomes large, giving a large mixing between the two eigenvectors, this 
expansion breaks down. In order to construct a fermionic force from equation 
(16), I require the momentum vectors, 

a™ 1 * = - i/cr(^i| x 7 5 [(l - 'y fl )T™(x)U fl (x)5 y , x+fl — 

(l+ llx )Ul(x)T;(x)5 x , y+lx ]M y , (17) 

which are defined so that 

n nx ^ = (^\SQM. (18) 

The (non-area conserving) fermionic force for this two eigenvalue system is 
thus 

F» ac (x)(t,II) = (X\ [ l5 FjJ Ac ^r,U) + ^ AC ' e (r, n) 7s |x] )a^T;(a;), (19) 

where \X) is the inverse of the overlap operator acting on the pseudo-fermion 
field |0): 

l*> = 7^10), (20) 

and the fermionic force F^x) is defined as the quantity added to the old mo- 
mentum to obtain the new momentum, i.e. II' (x) = IT M (x) + F^(x). F NAC 
refers to the term in F which is constructed from the eigenvectors close 
to zero and not area conserving. The force is usually constructed as F = 
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—iTlfd/dU(S g + <pH~ 2 (f)) + h.c, although in practice any Hermitian traceless 
matrix field which conserves energy and (up to a small, calculable, Jacobian) 
the measure will suffice. The dependence on the molecular dynamics time, r, 
is, in this notation, absorbed into the definition of the force. 

Now the reason why the terms equal to one have been added in equation (16) 
should be clear. We need a construction of the force such that FjJ AC ' e naij 
is indeed proportional to the momentum vectors, so that we can easily ex- 
tract F* AC >\ This requires that the right hand side of (16) is proportional to 
(-0,1 SQ \ipj) for some % and j. Hence we have to introduce the additional terms 
equal to unity. The numerator of these terms provides the a^ik of the left 
hand side of (16). In principle, there is a choice between using 
(ipi | SQ j SQ and \ SQ \ip2) / (VM $Q IV^)- However, we cannot 
introduce terms such as \ SQ \ipi) in the denominator of the force because 
of instabilities when this quantity becomes zero. Since sin 2 6 is proportional to 
(■01 1 5Q IV^) j there are no infinities in the definition of F^ AC ' e given in equa- 
tion (16), although obviously care is needed in its numerical implementation 
to avoid dividing zero by zero. 

Of course, in real life we have more than two eigenvectors. Only the eigen- 
vectors whose eigenvalues are close to zero need to be treated with the NAC 
algorithm. However, all eigenvectors with eigenvalues below a suitable cutoff, 
A, which has to be tuned for each set of simulation parameters, must be dif- 
ferentiated in this way. To include additional eigenvectors in the NAC setup, 
we need to include additional mixing angles. To simplify the expressions, I 
assume that only one mixing angle is large at any time, so that I can write 
the new eigenvector as 



where the sum runs over all eigenvectors with eigenvalues below the cutoff. 
If there is more than one large mixing angle the new eigenvector defined in 
equation (21) is no longer normalised. Although this problem has not occurred 
in my tests, the solution would be to use the full expansion in terms of Euler 
angles. For example, for three eigenvectors we would write 



K> = cos 12 cos 13 IVi) + cos 9 13 sin 9 12 e tSl2 ^2} + sin 9 13 e tSl * |^ 3 > • (22) 



For equation (21), the mixing angles calculated in equations (12) and (13) 
can be used. If equation (22) is used, it would be necessary to derive new 
expressions for the mixing angles. 




(21) 
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The non area-conserving force is a function of the momenta and is not an 
odd function of the time. Therefore, to ensure reversibility it is necessary to 
update the momentum field in two steps: 

n 0.5 = n + pAC Q + pNAC g ? n 0.5^j ? 

n 0.5 = n l + pAC + pNAC n 0.5^j (33) 

The first step requires an iterative procedure. This iteration does not signifi- 
cantly slow down the HMC algorithm because the time-consuming parts of the 
force calculation, including the overlap inversions, eigenvalue calculation and 
the calculation of the momentum vectors o;^, are the same for each iteration 
and thus only need to be computed once for each calculation of the force. The 
iteration always converged to numerical precision within three or four steps. 
Given that the force is a highly non-linear function of the momentum, there 
is a danger that there may be multiple solutions to the iteration or chaotic 
effects. For this reason, the reversibility must be carefully checked. My numer- 
ical results on 8 3 16 lattices are given in section 3.1, and show no breakdown 
in reversibility across a large range of molecular dynamics time-steps. 

Because this momentum update is not area conserving, two Jacobians must be 
calculated, one for each of the updates in equation (23). Both Jacobians can 
be computed using the same method. Since only the momentum is updated, 
dU' /dll = 1 and dU' /dll = 0. Therefore, to calculate the Jacobian we need 
to calculate only dll' /dll. For the second update in (23), this is 



OTY l 



. dir 3 % 



:^ AC ' e (-r/2,n°- 5 ) 75 



O7r 0.5 

5 J '5^ 5 y OL n //fiP^p A nm ^ p. (24) 

I obtain the second equality by noting that the only momentum dependence 
within F is contained in terms such as \ 5Q which, when differentiated, 
gives terms proportional to aij. By rewriting the vectors in terms of a 
complete orthonormal basis a' k , so that OLijOLnmMj^m = ot' k a\ '< A' kl , it is easy to 
calculate the Jacobian in terms of the small matrix A': 

J = det[l - A']. (25) 

For sufficiently large eigenvalues, the logarithm of the Jacobian should scale 
as 0(r 3 ) for each molecular dynamics step. This is the same as the change in 
the energy. The easiest way to see this is to note that the molecular dynamics 
update is reversible, which means that the logarithm of the Jacobian must 
be an odd function of time. Furthermore, at O(r) this method is identical to 



10 



the old area conserving algorithm; therefore the highest order term which can 
contribute to the Jacobian is 0(r 3 ). This is seen numerically in section 3.2. 

Of course, if the eigenvalues are small, and the Taylor expansion of sin 6 in 
rj (Ai — A 2 ) does not converge, then it is possible to get large Jacobians, just 
as large forces blighted the old method. However, this method offers several 
advantages. Firstly, the change in the logarithm of the Jacobian scales as 
0(log(r/ (Ai — A2))), rather than a fermionic force (and thus change in kinetic 
energy) scaling as 0(r/(Ai — A 2 )). Secondly the absence of large fermionic 
forces improves the stability of the algorithm (a small numerical error in a large 
force could lead to a large error in the energy). Finally, because the trajectory is 
smooth, there is a possibility of cancellations between a large positive Jacobian 
as the eigenvalues approach and a negative Jacobian as they depart; while with 
the old method the large force focused on one eigenvector meant that that 
eigenvector changed rapidly, leaving no opportunity for any cancellation. In 
our numerical tests on 8 3 16 lattices I did not see any logarithms of Jacobians 
larger than 0.4 even at relatively large time steps. Energy violations of order 
100 or higher were common with the old algorithm. These results will be 
discussed in section 3.3. 

In this paper, I have presented the method without any smearing, and it is 
not my intention to describe the smeared version of the algorithm in detail. 
However it is prudent to make a few comments. I have adapted and successfully 
run a version of this algorithm including stout smearing. From [16] , equation 
(71), I obtained an expression deriving the differential of the smeared link 
with respect to the differential of the original link. To calculate the vectors 
ck™^, I simply applied this expression to the derivative of the gauge field, 
iT l U . Equation (72) of [16], which is normally used to calculate the smeared 
force, cannot be used because we need to efficiently calculate the Jacobian. 
While this approach can almost certainly be improved, it worked. Efficiently 
parallelising the code required adapting the algorithm so that it could calculate 
the differential of links separated by sufficient distance (twice the number of 
smearing steps plus one link) simultaneously. This procedure is acceptably 
quick for one or two smearing steps, but is more costly for larger iterations of 
smearing. 



3 Numerical results 

I tested the algorithm on a 8 3 16 ensemble with mass \x = 0.05, (3 = 8.35 with 
a tadpole improved Luscher-Weisz gauge action [22,23,24,25], k = 0.2, and no 
additional pseudo-fermions. In order to test the routine in the most extreme 
conditions possible on these lattices I did not use any stout smearing. In an 
actual HMC simulation, I would, of course, use moderate smearing to remove 
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Fig. 2. Test of the reversibility. The plot shows the difference between the initial en- 
ergy and the energy after running a forwards and backwards trajectory, normalised 
by the initial energy. 

dislocations. 

I will test the reversibility of the algorithm (section 3.1), whether the Jacobian 
is sufficiently small to leave the acceptance rate unaffected, whether it scales 
with the molecular dynamics time as predicted (section 3.2), and whether the 
new algorithm is indeed successful in eliminating the large forces (section 3.3). 

3. 1 Test of reversibility 

To test that the algorithm is reversible, I ran forward and backward trajec- 
tories of length ten micro-canonical steps for twenty 8 3 16 fi = 0.05 configu- 
rations, and calculated the difference between the initial and final energies. 
I tested time steps between 5t = 0.001 and 0.03, and the average difference 
in the initial and final energies are plotted in figure 2. I see no breakdown in 
reversibility at any of these timescales (the energy differences are consistent 
with the accuracy which I use when inverting the overlap operator). I have 
also checked the reversibility by comparing the smallest Wilson eigenvalues 
during the forward and reverse trajectories; and again, there was no sign of a 
breakdown of reversibility to the working precision. 

3.2 Scaling of Jacobian 

To confirm that the Jacobian scales as expected with the molecular dynamics 
time, on the same configurations used in section 3.1, I calculated the average 
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Fig. 3. The average change in the logarithm of the Jacobian for each micro-canonical 
step as a function of molecular dynamics time. 
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(|AlogJ|) 


(A log J) 


max A log J 


0.001 


4.0(6) x 10~ 6 


1.5(9) x 10~ 6 


3.14 x 10~ 5 


0.005 


4.1(4)- 4 


0.9(52) x 10~ 5 


3.11 x 10 -3 


0.008 


2.1(3) x 10~ 3 


-0.2(39) x 10~ 4 


0.02517 


0.01 


4.2(7) x 10~ 3 


-4.0(76) x 10- 4 


0.0636 


0.012 


6.7(9) x 10- 3 


-1.8(10) x 10~ 3 


0.0818 


0.014 


9.7(15) x 10~ 3 


-2.0(16) x 10- 3 


0.0956 


0.016 


0.016(2) 


-5.8(23) x 10- 3 


0.139 


0.02 


0.090(36) 


-0.018(39) 


0.187 


0.03 


0.095(44) 


0.013(51) 


0.343 



Table 1 

The average change in the absolute value of the logarithm of the Jacobian and 
the logarithm of the Jacobian for each micro-canonical step as a function of the 
molecular dynamics time, and the largest change in the Jacobian seen across the 
test trajectories on one micro-canonical step. 

change in the logarithm of the Jacobian, A log J, for each micro-canonical 
step. This average change is plotted against r in figure 3, with the values 
given in table 1. To confirm that the scaling is the expected 0(r 3 ), I fitted 
the results using |AlogJ| = (ar) n , using a and n as free parameters. The 
best fit, with seven degrees of freedom, had a \ 2 value of 5.7. It gave n = 
3.005 ± 0.100, the expected value within the statistical errors. The largest 
change in the logarithm of the Jacobian for a micro-canonical step observed 
during the various test trajectories was 0.34: not large enough to cause the 
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configuration to be rejected. The logarithm of the Jacobian, as can be seen 
in table 1, is not noticeably biased towards being either positive or negative. 
This means that over the course of a trajectory there will be cancellations 
between positive and negative log J, so that the effect on the acceptance rate 
will be even smaller than might be expected from the 0(r 3 ) scaling. 



3.3 Comparison of fermionic forces 



During my test trajectories, I calculated the fermionic force using both the 
original algorithm and the new algorithm, although I only used the force from 
the new algorithm when updating the momentum. This allowed me to directly 
compare the two forces. From figure 4 it is clear that the new fermionic force 
is stable, while the force from the old algorithm is considerably more unstable. 
The instabilities in the old algorithm fermionic force are, of course, exaggerated 
compared to a production run because I am not using any smearing (note that 
the eigenvalue scale in figure 1, based on data taken from a production run 
which used two levels of stout smearing, is a factor of ten larger than the scale 
in figure 4). However, I expect the picture from figure 4 to be duplicated on 
larger lattices with smearing, because the density of smaller eigenvalues would 
increase. None of my test trajectories had any peaks in the fermionic force. 
As mentioned earlier, and as can be seen from the bottom plot in figure 4, I 
did see peaks in the Jacobians caused by the mixing (as expected), but these 
were not large enough to reduce the metropolis acceptance rate. 



4 Conclusion 



I have presented a new method to differentiate the eigenvectors of the ker- 
nel operator in an Hybrid Monte Carlo algorithm with overlap fermions. This 
new algorithm is reversible, scales well with the molecular dynamics time, is 
no slower to compute than the old algorithm (unless an excessive number of 
smearing steps are used), and, unlike the old algorithm, has no large peaks 
in the fermionic force. The method can easily be extended to variants of the 
HMC algorithm, such as RHMC, using multiple pseudo-fermion, or differen- 
tiable smearing. I therefore recommend that this new method is used in future 
dynamical overlap calculations which allow small kernel eigenvalues. 
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Fig. 4. Comparison of the trace of the square of the fermionic forces for the proposed 
and old algorithms with r = 0.016 on one of the \x = 0.05 trajectories (top), together 
with the Wilson operator eigenvalues (middle) and the log of the Jacobian (bottom). 
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A Calculation of force and Jacobian 



In this appendix, for simplicity I concentrate on the force and Jacobian from 
the mixing of one eigenvector pair. However, to illuminate the generalisation to 
the multiple eigenvector case of equation(21), and to avoid confusion between 
9ij and 9ji, I maintain the notation of equation (21) rather than reverting to 
the simpler notation of equation (10). The argument outlined here can easily 
be extended to include other pairs of eigenvectors. I also only consider the 
momentum update from II 5 to IT 1 , since the fermionic force and Jacobian for 
the update from 11° to IJ 5 can be constructed in the same way. 



I write the force in terms of the momentum vectors 



a™" = - iKT(ipi\ x -f 5 [(l - 7 M )7^(x)C/', 1 (a:)<J y>x+ , 4 - 

(1 + 7m)^W(^)]^-mI^->v. (A- 1 ) 



where T n are the Gell-Mann matrices normalised so that Tr(T"T m ) = 5 nm . 

Neglecting the gauge action and area conserving fermionic action, the energy 
conservation equation for an update from fields [II, U] to [II', U'} reads 



0= I(n /2 -n 2 ) + 



i 



i 



~7T 



,7T 



H[U'Y H[U] 



3 l 



(A.2) 
(A.3) 



where 



ir nxfl =Tr(T™n M (a:)), 



n„(aO =TV 



n nxfi 



(A.4) 
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and 



nxu, 

•r 01- ■ 

=Cij sin 9ij cos %e-^ 

nx/j, 

/•J;'" =C>n%cos^e^^, 

, / nx/i nip N 

-L , w Q „ / Cfc A A Qt A 



.2 



C ab =(1 - /i 2 ) ((X| 75 |V0 <^|X> + (X|^ a ) (^| 75 (e(A0 - e(A j )) , 

5Qab=<^a|5Q|V'6>- (A.5) 

Equation (A. 4) can be used to convert between the vector form of the mo- 
mentum (more useful in this formulation) and the matrix form (used in the 
numerical implementation). 5Q, 9 and 5 are all functions of II. 

Energy is conserved if 

7r ,n ^ =vr n ^ + F™ 1 * + F™» + F^ + F^ (A.6) 
• " (A.7) 

where the coefficients Bij can be determined from equation (A.5). 
To calculate the Jacobian, dir nxiJ - /dn myu I note that 
dSQa m y V 

Q n myu %% ' 

OSQji myv 

Q^myv 3* ' 

myv 

Q^myv l 3 ' 



Q^myv 33 



«5T> (A.8) 



giving 



ZL rv m?/!/ A- m2/I/ rv myv — n myv 

"■i ./< _|_ ^ii \u_ (A 9) 



sin 4% dn myv 25Qji 25Q ij \ - Xj + SQ U - 5Q 



33 



and 



. s d e iSij lx ^i" 

e -«*« — = ~2l » J (A 10) 

<9tt^ 2<5Q, 4 1 ' ; 
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I use these expressions to differentiate Bij, and write the Jacobian in the form 

AAA _ <V lXil ai mvv rA 111 



where 



A .... - 



sin 4% ((Cjj - Co) sin 2% - 2Cjj cos 2%e i5< - ) 

8SQji(Xi - Xj + 5Qu - SQjj) 
(Cjj - Co) (8 sin 2 Oij - sin 2% sin 4%) 



...id 



Cjie iSi i (2 sin 2% - cos 2% sin 4%) 
(Cjj - Co) sin 29ij sin 4% (T^e^ (cos 2% sin 4% - 2 sin 20^) 



A — _ A.. .. 

^3^,33 ~ -^JijM) 



/I.... 



16(<5Q^)Wu) &(SQn)(.SQij) 

>,,iii 

sin 4% ((Ctf - C«) sin 2% - 2C^ cos2%e- l<5 - ) 
85Qij(\i - Aj + - 5(5 jj) 



(Cjj — Cjj) sin 20^ sin40y C^e lJ '(cos 2% sin 4% — 2 sin 2%) 
l " 1 = lGiSQjiXSQij) + S(5Q Jl )(5Q lj ) 
(Cjj - Co) (8 sin 2 Oij - sin 2% sin 4%) 



Cij-e - ^ (2 sin 2% - cos 2% sin 4%) 

s(5Q~p ' 

^y'j'i = ~~ Aij,Hi (A- 12) 

and all other components of A are 0. 
It is my intention to use the standard result 

\1-Aij\ (A. 13) 



1 - a^af^Aij 



to calculate the determinant. There are two things which must be done before 
applying this result. First of all, I have calculated Oi i ja nm A i j tnm not aja]^; 
however since = ar^, the correct expression is obtained by exchanging the 
12 and 21 columns of the matrix Aij jnm calculated above. Secondly, I need to 
re-express the as in terms of an orthonormal basis. It is easy, though only 
necessary in the theoretical proof of equation (A. 13) and not in a numerical 
implementation, to construct other vectors orthonormal to the new as so that 
the basis spans the entire space. 

I can construct an orthonormal basis for by, first of all, expressing the 
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vectors in terms of a single index, and writing 



CCi — >CKi/ y (cti, tti), 

A u -^-An^a^ai), 

An -»iii^(ai,Qi), (A. 14) 

then projecting cci from the other vectors ctj 

x =(«!, a,); 

^ j X Qt ]_ - 

An ^An + x^Aij, 
An ~ > An + xAji, 
Aij -^Aij + xAjj, 

Aji -^Aji + x^Ajj, (A. 15) 

for i 7^ 1 and i ^ j. This procedure can then be repeated for the other 
vectors in turn. Once recast into an orthonormal basis, I can use equation 
(A. 13) to express the Jacobian in terms of the determinant of a small matrix. 
This determinant can then be easily calculated using a standard method, for 
example LU decomposition [26]. 



B The reflection/transmission update 



During the transmission step, which occurs when an eigenvalue of the kernel 
operator crosses zero (and the momentum is sufficiently large that I do not 
have to reflect), I recommend using an momentum update 

n+ =Tl~ + T c (F)-riT c (ri,F) + 

(ifrfa, n- + |(f+ + f-)) + 772(772, n- + |(F+ + F"))) x 



1 + 



do 



\ ( ?7l ,n- + f(F+ + F 



(?72,n- + f (F+ + F~y 



log ( e -(n-,»;) 2 /2-2d + 1 _ e -2d} 

2 . rrr 1 



(n-,77) 2 



+ 



fB.l^ 



d 2 =(-2r c (F", 77)(n-, 77) + 2r c (F+, V )(T1 + , 77) + r c 2 (F" + F+ , F)). (B.2) 

The notation, which is chosen to be consistent with my earlier work, is outlined 
in [5,11] together with the full details of the construction and why I believe it to 
be superior to other algorithms. Here I will limit myself to explaining the most 
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important features of each term. In this formula (in the case where there is no 
smearing, and up to a normalisation factor) 77 = an is a unit momentum vector 
normal to the surface of zero eigenvalue (in the space of all possible gauge field 
configurations), d is half of the change to the momentum energy, 771 and T] 2 are 
arbitrary vectors perpendicular to i] and the force difference F = (F + — F~) — 
|Tr(F + — F~), the — superscript indicates a force or momentum calculated 
with the smallest eigenvalue having its original sign, while + indicates that the 
force or momentum was calculated with the eigenvalue having its final sign, 
and all these quantities are calculated on the gauge field with zero eigenvalue. 
In this section, when referring to in general (for example in the calculation 
of the Jacobians), it should be understood that I am excluding 77 = «„. Even 
with the old algorithm, this update is not area conserving, but it is constructed 
to conserve the action, including the Jacobian term. The d 2 term cancels out 
an O(t) energy difference caused because the momentum is not updated at 
the moment of crossing [5]. The other improvement to the algorithm originally 
published by Zoltan Fodor et al. [4] is in the term proportional to 77, which 
has an increased rate of transmission [11]. 

However, this update is a function of the fermionic force, and by using the non 
area conserving fermionic force proposed in this paper, it is necessary to calcu- 
late the Jacobian for the transmission step. For simplicity, I shall here write F 
as a function of II - , although in practice, to maintain reversibility, it is again 
necessary to update the momentum in two steps, using an iterative procedure 
for one of the updates. First of all, I need to construct an orthonormal basis, 
rji and 772 from the vectors a^, an and two additional vectors, where I 
ensure that 771 and i] 2 are both also normal to the area conserving part of F 
(the non-area conserving part of F is of course proportional to the vectors 
in any case). For convenience, I write that an = an = 77. It is easy to show 
that because 771 and r] 2 are normal to F and all the vectors atij, — 0. 

Similarly, ^ = 0, (except, of course, when i — j — 1) and g^-'^ = 0. 
Thus, I write the Jacobian in the form 



d(n+,ri) d(U+, v ) d(Tl+,r)) 

a(Ti-,n) dcn-^ij) a(Ti-, m ) 

diU+^j) c>(n+,a tJ ) djU+^jj) 

d(U-, v ) d{Ii-,6cij) 9(U-, m ) 

d(n+, m ) d(u+, Vk ) d(n+, m ) 

8(11-, V ) 8(11-, a i:j ) 8(11-, Vk ) 



d(n+,r?) n 
d(Tl-,rj) U 

d(n+,a tJ ) 



7^0 
7^0 








8(11 

1 ri d(Tl+,r) k ) 
T U 8(U~, Vk ) 



(B.3) 



d(U + ,f])/d(U~ and d(U + ,i]i)/d(Il~ ,r]i) have already been calculated in 
[5,11]. All that remains is to calculate the Jacobian for <9(II + , <5j J )/(9(Il + , a^). 
For one of these two half-updates, I obtain 

(n + , dm) =(n - 5 , an) + r c (F(n - 5 ), 

= (n - 5 ,^) + r c ((F(n°- 5 ),a fc ) - (F(n°- 5 ), 77)(77,a fc )). (B.4) 
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Thus, 



3(n+ a*) = + ^U^'l - ^LK^(^)- (b.5) 



And from here, I proceed as before. 
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